Decision Tree Based Feature Selection and Multilayer Perceptron for Sentiment Analysis

Author

  • Jeevanandam Jotheeswaran
Abstract

Sentiment analysis plays an important role in brand and product positioning, consumer attitude detection, market research, and customer relationship management. An essential part of information gathering for market research is finding out what people think about a product. With the availability and popularity of resources such as online review sites and personal blogs, new opportunities and challenges arise as people can, and do, use information technologies to understand the opinions of others. In this paper, a Multi-Layer Perceptron (MLP) is used to classify features extracted from movie reviews. A decision tree-based feature ranking is proposed for feature selection. The ranking is based on the Manhattan Hierarchical Cluster Criterion. In the proposed feature selection, decision tree induction selects the relevant features. Decision tree induction constructs a tree structure in which each internal node denotes a test on an attribute, each branch represents an outcome of that test, and each external (leaf) node denotes a class prediction. A hybrid algorithm based on Differential Evolution (DE) and Genetic Algorithm (GA) is also proposed to optimize the weights of the MLP neural network (MLPNN). The IMDb dataset is used to evaluate the proposed method. Experimental results show that the proposed feature selection improves the performance of the MLP significantly, by 3.96% to 6.56%. A classification accuracy of 81.25% was achieved when 70 or 90 features were selected.
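
The idea of ranking features with a decision tree criterion can be illustrated with a minimal sketch. The abstract does not define the Manhattan Hierarchical Cluster Criterion, so the sketch below substitutes information gain, the split criterion used by ID3-style decision tree induction; it is not the paper's method. The toy reviews, labels, and the function names `entropy`, `information_gain`, and `rank_features` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, term):
    """Entropy reduction obtained by splitting documents on presence of `term`."""
    present = [y for d, y in zip(docs, labels) if term in d]
    absent = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    # Weighted entropy of the two branches of the attribute test.
    remainder = sum(len(part) / n * entropy(part)
                    for part in (present, absent) if part)
    return entropy(labels) - remainder


def rank_features(docs, labels, k):
    """Return the k terms with the highest information gain."""
    vocab = sorted(set().union(*docs))
    # sorted() is stable, so ties keep alphabetical order.
    ranked = sorted(vocab,
                    key=lambda t: information_gain(docs, labels, t),
                    reverse=True)
    return ranked[:k]


# Hypothetical toy data standing in for movie reviews (assumption, not IMDb).
reviews = [
    ("great acting and a great plot", "pos"),
    ("dull plot and poor acting", "neg"),
    ("great film", "pos"),
    ("poor film", "neg"),
]
docs = [set(text.split()) for text, _ in reviews]
labels = [label for _, label in reviews]

top_terms = rank_features(docs, labels, 2)
print(top_terms)
```

In a full pipeline, only the top-ranked terms would be kept as input features for the classifier; the abstract reports its best accuracy with 70 or 90 selected features.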

Similar resources

Knowledge discovery using neural approach for SME's credit risk analysis problem in Turkey

This study proposes a knowledge discovery method that uses a multilayer perceptron (MLP) based neural rule extraction (NRE) approach for credit risk analysis (CRA) of real-life small and medium enterprises (SMEs) in Turkey. A feature selection and extraction stage is followed by neural classification that produces accurate rule sets. In the first stage, the feature selection is achieved by decisi...

Machine Learning Based Approaches for Prediction of Parkinson’s Disease

The prediction of Parkinson’s disease is an important and challenging problem for biomedical engineering researchers and doctors. The symptoms of the disease are investigated in middle and late middle age. In this paper, a minimum redundancy maximum relevance feature selection algorithm is used to select the most important features among all the features to predict Parkinson’s disease. Here, it ...

Thalassaemia classification by neural networks and genetic programming

This paper presents the use of a neural network and a decision tree, which is evolved by genetic programming (GP), in thalassaemia classification. The aim is to differentiate between thalassaemic patients, persons with thalassaemia trait and normal subjects by inspecting characteristics of red blood cells, reticulocytes and platelets. A structured representation on genetic algorithms for non-li...

Effective and extensible feature extraction method using genetic algorithm-based frequency-domain feature search for epileptic EEG multiclassification

In this paper, a genetic algorithm-based frequency-domain feature search (GAFDS) method is proposed for the electroencephalogram (EEG) analysis of epilepsy. In this method, frequency-domain features are first searched and then combined with nonlinear features. Subsequently, these features are selected and optimized to classify EEG signals. The extracted features are analyzed experimentally. The f...

A Hybrid Feature Selection by Resampling, Chi squared and Consistency Evaluation Techniques

In this paper, a combined feature selection method is proposed which takes advantage of sample domain filtering, resampling and feature subset evaluation methods to reduce the dimensions of huge datasets and select reliable features. This method utilizes both feature space and sample domain to improve the process of feature selection and uses a combination of Chi squared with Consistency attribute ...

Publication date: 2015